Soft Margins for AdaBoost (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

Author

  • Takashi Onoda
Abstract

Recently, ensemble methods like AdaBoost have been successfully applied to character recognition tasks, seemingly defying the problems of overfitting. This paper shows that although AdaBoost rarely overfits in the low-noise regime, it clearly does so for higher noise levels. Central to understanding this fact is the margin distribution, and we find that AdaBoost, doing gradient descent in an error function with respect to the margin, asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns (here an interesting overlap with Support Vectors emerges). This is clearly a sub-optimal strategy in the noisy case, and regularization, i.e. a mistrust in the data, must be introduced into the algorithm to alleviate the distortions that difficult patterns (e.g. outliers) can cause to the margin distribution. We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin, a concept known from Support Vector learning. In particular we suggest (1) regularized AdaBoost_Reg, which uses the soft margin directly in a modified loss function, and (2) regularized linear and quadratic programming (LP/QP-) AdaBoost, where the soft margin is attained by introducing slack variables. Extensive simulations demonstrate that the proposed regularized AdaBoost-type algorithms are useful and competitive for noisy data.
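To make the abstract's central quantity concrete, the following is a minimal sketch (not the authors' implementation; all function names are illustrative) of plain AdaBoost with decision stumps. It returns the normalized margin y_i f(x_i) / sum_t alpha_t of every training pattern, whose distribution the paper analyses; the exponential reweighting step is what concentrates the algorithm's resources on a few hard-to-learn patterns.

import numpy as np

def train_stump(X, y, w):
    # Exhaustive search for the axis-parallel threshold stump minimizing the
    # weighted training error; illustrative only, not optimized.
    best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, X):
    _, j, thr, pol = stump
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def adaboost_with_margins(X, y, T=50):
    # Plain AdaBoost for labels y in {-1, +1}; returns the ensemble together
    # with the normalized margin of every training pattern.
    w = np.full(len(y), 1.0 / len(y))
    stumps, alphas = [], []
    for _ in range(T):
        stump = train_stump(X, y, w)
        pred = stump_predict(stump, X)
        err = max(w[pred != y].sum(), 1e-12)
        if err >= 0.5:                      # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * pred)      # hard patterns gain weight here
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    f = sum(a * stump_predict(s, X) for s, a in zip(stumps, alphas))
    margins = y * f / np.sum(alphas)        # normalized margins in [-1, 1]
    return stumps, np.array(alphas), margins

For the LP-AdaBoost variant in point (2), the soft margin is commonly obtained by solving a linear program over the hypothesis weights with one slack variable xi_i per pattern (the exact normalization used in the paper may differ):

    maximize    rho - C * sum_i xi_i
    subject to  y_i * sum_t alpha_t * h_t(x_i) >= rho - xi_i   for all i,
                sum_t alpha_t = 1,  alpha_t >= 0,  xi_i >= 0,

where the constant C encodes the mistrust in the data: granting slack lets difficult patterns fall inside the margin without distorting it for the remaining patterns.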

Similar articles

Data-dependent Structural Risk Minimisation for Perceptron Decision Trees (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

Perceptron Decision Trees (also known as Linear Machine DTs, etc.) are analysed in order that data-dependent Structural Risk Minimization can be applied. Data-dependent analysis is performed which indicates that choosing the maximal margin hyperplanes at the decision nodes will improve the generalization. The analysis uses a novel technique to bound the generalization error in terms of the marg...

Discrete versus Analog Computation: Aspects of Studying the Same Problem in Different Computational Models (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

In this tutorial we want to outline some of the features that come up when analyzing the same computational problems in different complexity-theoretic frameworks. We will focus on two problems; the first related to mathematical optimization and the second dealing with the intrinsic structure of complexity classes. Both examples serve well for working out how far different approaches to the same pro...

Multiplicative Updatings for Support-vector Learning (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

Support Vector machines find maximal margin hyperplanes in a high-dimensional feature space. Theoretical results exist which guarantee high generalization performance when the margin is large or when the number of support vectors is small. Multiplicative-Updating algorithms are a new tool for perceptron learning whose theoretical properties are well studied. In this work we present a Multiplica...

Dynamically Adapting Kernels in Support Vector Machines (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

The kernel parameter is one of the few tunable parameters in Support Vector machines, and it controls the complexity of the resulting hypothesis. The choice of its value amounts to model selection, and is usually performed by means of a validation set. We present an algorithm which can automatically perform model selection and learning with no additional computational cost and with no need of a va...

متن کامل

Latent Semantic Kernels for Feature Selection (Produced as Part of the ESPRIT Working Group in Neural and Computational Learning II, NeuroCOLT2 27150)

Latent Semantic Indexing is a method for selecting informative subspaces of feature spaces. It was developed for information retrieval to reveal semantic information from document co-occurrences. The paper demonstrates how this method can be implemented implicitly in a kernel-defined feature space and hence adapted for application to any kernel-based learning algorithm and data. Experiments with...


Publication date: 1998